Unraveling complex temporal associations in cellular systems across multiple time-series microarray datasets

Authors

  • Wenyuan Li
  • Min Xu
  • Xianghong Jasmine Zhou
Abstract

Unraveling the temporal complexity of cellular systems is a challenging task, as the subtle coordination of molecular activities cannot be adequately captured by simple mathematical concepts such as correlation. This paper addresses the challenge with a data-mining approach. We introduce the novel concept of a "frequent temporal association pattern" (FTAP): a set of genes that simultaneously exhibit complex temporal expression patterns, recurring across multiple microarray datasets. Such temporal signals are hard to identify in individual microarray datasets, but become significant through their frequent occurrence across multiple datasets. We designed an efficient two-stage algorithm to identify FTAPs. First, for each gene we identify expression trends that occur frequently across multiple datasets. Second, we look for a set of genes that simultaneously exhibit their respective trends recurrently in multiple datasets. We applied this algorithm to 18 yeast time-series microarray datasets. The majority of FTAPs identified by the algorithm are associated with specific biological functions. Moreover, a significant number of patterns include genes that are functionally related but do not exhibit co-expression; such gene groups cannot be captured by clustering algorithms. Our approach offers three advantages: (1) it can identify complex associations of temporal trends in gene expression, an important step towards understanding the complex mechanisms governing cellular systems; (2) it is capable of integrating time-series data with different time scales and intervals; and (3) it yields results that are robust against outliers.
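
The abstract only outlines the two-stage algorithm, so the following is a toy sketch of the general idea rather than the authors' implementation. It assumes each dataset is a dict mapping gene names to expression series, that a "trend" is a fixed-length string of up/down/flat steps (U, D, F), and that co-occurrence is checked only at the dataset level, not at aligned time points. All function names, the threshold `eps`, and the sample data are hypothetical.

```python
from itertools import combinations

def trend_string(series, eps=0.5):
    """Discretize a series into 'U' (up), 'D' (down), 'F' (flat) steps."""
    out = []
    for a, b in zip(series, series[1:]):
        diff = b - a
        out.append("U" if diff > eps else "D" if diff < -eps else "F")
    return "".join(out)

def frequent_trends(datasets, gene, length=2, min_support=2, eps=0.5):
    """Stage 1 (assumed form): trends of `gene` seen in >= min_support datasets.

    Returns {trend: set of dataset indices where the trend occurs}.
    """
    occurrences = {}
    for idx, data in enumerate(datasets):
        if gene not in data:
            continue
        s = trend_string(data[gene], eps)
        for i in range(len(s) - length + 1):
            occurrences.setdefault(s[i:i + length], set()).add(idx)
    return {t: ds for t, ds in occurrences.items() if len(ds) >= min_support}

def find_patterns(datasets, genes, size=2, length=2, min_support=2, eps=0.5):
    """Stage 2 (assumed form): gene sets whose trends co-occur in >= min_support datasets."""
    items = {}
    for g in genes:
        for t, ds in frequent_trends(datasets, g, length, min_support, eps).items():
            items[(g, t)] = ds
    patterns = []
    for combo in combinations(sorted(items), size):
        if len({g for g, _ in combo}) < size:
            continue  # allow only one trend per gene in a pattern
        shared = set.intersection(*(items[c] for c in combo))
        if len(shared) >= min_support:
            patterns.append((combo, sorted(shared)))
    return patterns

# Hypothetical data: gene A rises while gene B falls in datasets 0 and 1.
datasets = [
    {"A": [0, 1, 2, 1], "B": [2, 1, 0, 0]},
    {"A": [0, 2, 4, 4], "B": [5, 3, 1, 2]},
    {"A": [3, 2, 1, 0], "B": [0, 0, 0, 0]},
]
print(find_patterns(datasets, ["A", "B"]))
```

In this made-up example the only pattern found pairs a rising trend in A with a falling trend in B, which illustrates how a recurring association can link genes that are not co-expressed. Because trends are step strings rather than raw values, series with different lengths can be compared in the same way.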

Similar resources

MINING FUZZY TEMPORAL ITEMSETS WITHIN VARIOUS TIME INTERVALS IN QUANTITATIVE DATASETS

This research aims at proposing a new method for discovering frequent temporal itemsets in continuous subsets of a dataset with quantitative transactions. It is important to note that although these temporal itemsets may have relatively high support or occurrence within particular time intervals, they do not necessarily get similar support across the whole dataset, which makes i...

A hybrid filter-based feature selection method via hesitant fuzzy and rough sets concepts

High-dimensional microarray datasets are difficult to classify since they have many features with a small number of instances and an imbalanced distribution of classes. This paper proposes a filter-based feature selection method to improve the classification performance of microarray datasets by selecting the significant features. Combining the concepts of rough sets, weighted rough set, fuzzy rough se...

Learning Multiple Temporal Matching for Time Series Classification

In real applications, time series are generally of complex structure, exhibiting different global behaviors within classes. To discriminate such challenging time series, we propose a multiple temporal matching approach that reveals the commonly shared features within classes, and the most differential ones across classes. For this, we rely on a new framework based on the variance/covariance cri...

Schemas of Clustering

Data mining techniques, such as clustering, have become a mainstay in many applications such as bioinformatics, geographic information systems, and marketing. Over the last decade, due to new demands posed by these applications, clustering techniques have been significantly adapted and extended. One such extension is the idea of finding clusters in a dataset that preserve information about some...

Multiple gene expression profile alignment for microarray time-series data clustering

MOTIVATION Clustering gene expression data given in terms of time-series is a challenging problem that imposes its own particular constraints. Traditional clustering methods based on conventional similarity measures are not always suitable for clustering time-series data. A few methods have been proposed recently for clustering microarray time-series, which take the temporal dimension of the da...


Journal:
  • Journal of Biomedical Informatics

Volume 43, Issue 4

Pages: -

Publication date: 2010